Skip to content

Move key-count and dirty-key tracking from TemplateCcMap to CcShard#382

Merged
liunyl merged 3 commits intomainfrom
dirty_size
Jan 29, 2026
Merged

Move key-count and dirty-key tracking from TemplateCcMap to CcShard#382
liunyl merged 3 commits intomainfrom
dirty_size

Conversation

@liunyl
Copy link
Contributor

@liunyl liunyl commented Jan 27, 2026

Replace per-ccm size_/dirty_size_ counters in TemplateCcMap with shard-level counters in CcShard, tracking only data tables (meta tables excluded).

  • Remove size_ and dirty_size_ from TemplateCcMap
  • Update all mutation points to use AdjustDataKeyStats()
  • Update reporting to use GetDataKeyStats()
  • Add underflow assertions

Enables efficient shard-level dirty ratio reporting without per-ccm iteration.

Here are some reminders before you submit the pull request

  • Add tests for the change
  • Document changes
  • Reference the link of issue using fixes eloqdb/tx_service#issue_id
  • Reference the link of RFC if exists
  • Pass ./mtr --suite=mono_main,mono_multi,mono_basic

Summary by CodeRabbit

  • New Features

    • Entry-level dirty indicator and per-entry flush/commit hooks to improve MVCC and statistics accuracy.
    • Public APIs to adjust and read shard data-key counts.
  • Improvements

    • Memory reports now include per-core dirty-key counts and dirty-key ratios.
    • Exposed counter for dirty keys freed to improve observability.
    • Flush/commit flows emit extra notifications to keep key statistics consistent.
  • Other

    • Internal map-printing logic disabled (no-op).

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Jan 27, 2026

Walkthrough

Added per-entry dirty detection and optional CcMap hooks for flush/commit events, moved per-map size bookkeeping to shard-level data-key statistics, propagated dirty-state through commit/flush flows, refactored checkpoint-update requests to accept TableName, and exposed per-shard and per-core data-key statistics.

Changes

Cohort / File(s) Summary
CcMap hook interface
tx_service/include/cc/cc_map.h
Removed pure-virtual size() const; added virtual void OnEntryFlushed(bool was_dirty, bool is_persistent) with default empty implementation.
Template & object map updates
tx_service/include/cc/template_cc_map.h, tx_service/include/cc/object_cc_map.h
Replaced per-map size_ bookkeeping with shard-level AdjustDataKeyStats and added hooks OnCommittedUpdate/OnFlushed/OnEntryFlushed; propagate was_dirty through many commit/flush paths.
CcShard stats API & impl
tx_service/include/cc/cc_shard.h, tx_service/src/cc/cc_shard.cpp
Added data_key_count_ and dirty_data_key_count_; added AdjustDataKeyStats(const TableName&, int64_t, int64_t) and GetDataKeyStats() const.
Entry dirty detection
tx_service/include/cc/cc_entry.h, tx_service/src/cc/cc_entry.cpp
Added bool IsDirty() const to VersionedLruEntry to determine dirty state used by hooks and accounting.
Checkpoint request refactor
tx_service/include/cc/cc_req_misc.h, tx_service/src/cc/cc_req_misc.cpp, tx_service/src/cc/local_cc_shards.cpp
UpdateCceCkptTsCc constructor now takes TableName; batch processing fetches CcMap by table and calls OnEntryFlushed(...) per flushed entry; call sites updated.
Per-core & reporting changes
tx_service/include/cc/cc_request.h
Added per-core vectors total_key_cnt_vec_ and dirty_key_cnt_vec_ in CkptTsCc; populated from CcShard::GetDataKeyStats() and used to report per-core dirty-key ratios.
Page-clean accounting
tx_service/include/cc/cc_page_clean_guard.h
Added dirty_freed_cnt_ and public DirtyFreedCount(); incremented when freeing dirty keys in MarkClean.
Local shards printing / API call-site
tx_service/include/cc/local_cc_shards.h, tx_service/src/cc/local_cc_shards.cpp
PrintCcMap() body commented out (no-op); updated FlushData() usage to new UpdateCceCkptTsCc signature.

Sequence Diagram

sequenceDiagram
    participant Executor as Flush / Commit Logic
    participant CcMap as CcMap / TemplateCcMap
    participant CcShard as CcShard
    participant Stats as CkptTsCc (reporter)

    Executor->>CcMap: update CommitTs / CkptTs on entries
    Note over Executor: compute was_dirty = entry.IsDirty()
    Executor->>CcMap: OnEntryFlushed(was_dirty, is_persistent)
    CcMap->>CcShard: AdjustDataKeyStats(table_name, size_delta, dirty_delta)
    CcShard-->>CcShard: update data_key_count_ / dirty_data_key_count_
    Executor->>Stats: notify batch complete / request stats
    Stats->>CcShard: GetDataKeyStats()
    CcShard-->>Stats: (total_keys, dirty_keys)
    Stats->>Stats: compute/log per-core dirty-key ratio
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • thweetkomputer
  • xiexiaoy
  • githubzilla

Poem

🐇 I hopped through maps and counted every key,
Marked dirty carrots and set the stats free.
Shards tally seeds while I nibble a turnip,
Flush hooks hum softly—no glitch in my burrowed zip.
Hooray for tidy maps and a crunchy data treat!

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Description check ⚠️ Warning The description explains the main change and approach, but the PR checklist items are all unchecked and appear to be incomplete, particularly regarding tests, documentation, issue references, and test suite verification. Complete the PR checklist: add tests, document changes, reference the related issue using the format fixes eloqdb/tx_service#issue_id, reference RFC if applicable, and verify passing the test suite.
Docstring Coverage ⚠️ Warning Docstring coverage is 10.53% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main change: moving key-count and dirty-key tracking from TemplateCcMap to CcShard, which aligns with the primary objective of the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch dirty_size

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tx_service/include/cc/template_cc_map.h (1)

8665-8681: Guard dirty-key stats when evicting a single entry

CleanEntry always decrements total keys but never decrements dirty keys. If a dirty (non‑persistent) entry ever reaches this path, dirty stats will drift. A defensive decrement makes this safe either way.

♻️ Proposed fix
         ccpage->Remove(cce);
-        shard_->AdjustDataKeyStats(table_name_, -1, 0);
+        const bool was_dirty =
+            (cce->CommitTs() > 1 && !cce->IsPersistent());
+        shard_->AdjustDataKeyStats(table_name_, -1, was_dirty ? -1 : 0);
🧹 Nitpick comments (2)
tx_service/include/cc/local_cc_shards.h (1)

353-392: Avoid a silent no‑op in PrintCcMap().

Now the function does nothing, which can confuse callers and removes ordering checks. Consider making the disablement explicit (or gating the original logic behind a debug/flag) so callers know it’s intentionally off.

🔧 Minimal clarification
 void PrintCcMap()
 {
+    // Intentionally disabled to avoid expensive per‑ccm iteration at runtime.
+    // Use shard stats (e.g., CcShard::GetDataKeyStats()) for reporting.
     // std::unordered_map<TableName, size_t>
     //     mapsizes;  // not string owner, sv -> native_ccms_
     // ...
 }
tx_service/src/cc/local_cc_shards.cpp (1)

5842-5873: Use the entry’s TableName when issuing UpdateCceCkptTsCc.

Now that UpdateCceCkptTsCc is table-aware, deriving table_name from entries.front() assumes all entries in this bucket share the same logical table. Using entry->data_sync_task_->table_name_ avoids that coupling and reads clearer.

♻️ Suggested tweak
-                auto &table_name =
-                    entries.front()->data_sync_task_->table_name_;
+                const TableName &table_name =
+                    entry->data_sync_task_->table_name_;

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tx_service/include/cc/template_cc_map.h (1)

8674-8681: Adjust dirty stats when CleanEntry removes dirty entries.
CleanEntry can remove entries that are both IsFree() and dirty (CommitTs > 1 && !IsPersistent). Currently line 8680 always passes 0 for the dirty_delta, which will silently lose the dirty count if a dirty entry is removed. Defensively check the entry's dirty status:

Fix
-        ccpage->Remove(cce);
-        shard_->AdjustDataKeyStats(table_name_, -1, 0);
+        ccpage->Remove(cce);
+        const bool was_dirty =
+            (cce->CommitTs() > 1 && !cce->IsPersistent());
+        shard_->AdjustDataKeyStats(table_name_, -1, was_dirty ? -1 : 0);
🧹 Nitpick comments (1)
tx_service/include/cc/local_cc_shards.h (1)

353-392: Consider removing the commented-out PrintCcMap body or rewiring it.
Leaving a large commented block makes the function a no-op and harder to maintain. If intentionally disabled, delete the dead code or replace with a short stub comment; if still needed, consider using shard-level stats instead.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@tx_service/include/cc/object_cc_map.h`:
- Around line 1692-1695: The code calls this->OnFlushed(cce, was_dirty)
unconditionally even though SetCkptTs is only invoked when req.Kind() ==
UploadBatchType::DirtyBucketData; change the logic so OnFlushed is only called
when the checkpoint timestamp was actually updated (i.e., guard the
OnFlushed(cce, was_dirty) call behind the same condition that calls SetCkptTs),
while leaving this->OnCommittedUpdate(cce, was_dirty) executed as before;
reference SetCkptTs, OnFlushed, OnCommittedUpdate, req.Kind(),
UploadBatchType::DirtyBucketData, cce, and was_dirty to locate and update the
code path.

Comment on lines +1692 to +1695
// Since we have updated both ckpt ts and commit ts, we need to call
// OnFlushed to update the dirty size.
this->OnFlushed(cce, was_dirty);
this->OnCommittedUpdate(cce, was_dirty);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

OnFlushed called unconditionally, but SetCkptTs is conditional.

The comment states "we have updated both ckpt ts and commit ts," but SetCkptTs at line 1660 is only called when req.Kind() == UploadBatchType::DirtyBucketData. Calling OnFlushed when the checkpoint timestamp wasn't updated will cause incorrect dirty-size accounting.

🐛 Proposed fix
-            // Since we have updated both ckpt ts and commit ts, we need to call
-            // OnFlushed to update the dirty size.
-            this->OnFlushed(cce, was_dirty);
+            if (req.Kind() == UploadBatchType::DirtyBucketData)
+            {
+                // Since we have updated both ckpt ts and commit ts, we need to call
+                // OnFlushed to update the dirty size.
+                this->OnFlushed(cce, was_dirty);
+            }
             this->OnCommittedUpdate(cce, was_dirty);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Since we have updated both ckpt ts and commit ts, we need to call
// OnFlushed to update the dirty size.
this->OnFlushed(cce, was_dirty);
this->OnCommittedUpdate(cce, was_dirty);
if (req.Kind() == UploadBatchType::DirtyBucketData)
{
// Since we have updated both ckpt ts and commit ts, we need to call
// OnFlushed to update the dirty size.
this->OnFlushed(cce, was_dirty);
}
this->OnCommittedUpdate(cce, was_dirty);
🤖 Prompt for AI Agents
In `@tx_service/include/cc/object_cc_map.h` around lines 1692 - 1695, The code
calls this->OnFlushed(cce, was_dirty) unconditionally even though SetCkptTs is
only invoked when req.Kind() == UploadBatchType::DirtyBucketData; change the
logic so OnFlushed is only called when the checkpoint timestamp was actually
updated (i.e., guard the OnFlushed(cce, was_dirty) call behind the same
condition that calls SetCkptTs), while leaving this->OnCommittedUpdate(cce,
was_dirty) executed as before; reference SetCkptTs, OnFlushed,
OnCommittedUpdate, req.Kind(), UploadBatchType::DirtyBucketData, cce, and
was_dirty to locate and update the code path.

Replace per-ccm size_/dirty_size_ counters in TemplateCcMap with shard-level
counters in CcShard, tracking only data tables (meta tables excluded).

- Remove size_ and dirty_size_ from TemplateCcMap
- Update all mutation points to use AdjustDataKeyStats()
- Update reporting to use GetDataKeyStats()
- Add underflow assertions

Enables efficient shard-level dirty ratio reporting without per-ccm iteration.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@tx_service/include/cc/cc_entry.h`:
- Around line 632-641: Update the IsDirty() doc to explicitly state that entries
with CommitTs() <= 1 are treated as non-dirty (i.e., IsDirty() returns false)
even if the non-versioned flush bit isn’t set; mention that this special-case
mirrors the code path that checks CommitTs(), CkptTs(), and
commit_ts_and_status_ for initial entries and is intended for dirty size
tracking and metrics/tests to avoid misinterpretation.
🧹 Nitpick comments (4)
tx_service/include/cc/cc_request.h (2)

3157-3172: Guard frag ratio when committed == 0 to avoid inf output.
This keeps logs stable if a core reports zero committed memory.

Proposed tweak
-            LOG(INFO) << "ccs " << core_id << " memory usage report, committed "
-                      << committed << ", allocated " << allocated
-                      << ", frag ratio " << std::setprecision(2)
-                      << 100 * (static_cast<float>(committed - allocated) /
-                                committed)
+            const double frag_ratio =
+                committed == 0
+                    ? 0.0
+                    : 100.0 *
+                          (static_cast<double>(committed - allocated) /
+                           static_cast<double>(committed));
+            LOG(INFO) << "ccs " << core_id << " memory usage report, committed "
+                      << committed << ", allocated " << allocated
+                      << ", frag ratio " << std::setprecision(2) << frag_ratio
                       << " , heap full: " << heap_full
                       << ", dirty key ratio: " << std::setprecision(4)
                       << dirty_ratio << " (" << dirty_keys << "/" << total_keys
                       << ")";

3220-3221: Prefer 64-bit counters for key stats to avoid narrowing on 32-bit builds.

Proposed tweak
-    std::vector<size_t> total_key_cnt_vec_;
-    std::vector<size_t> dirty_key_cnt_vec_;
+    std::vector<uint64_t> total_key_cnt_vec_;
+    std::vector<uint64_t> dirty_key_cnt_vec_;
tx_service/include/cc/local_cc_shards.h (1)

353-394: PrintCcMap is now a no-op — consider removing or wiring to shard stats.

Keeping a large commented-out block hurts maintainability and drops useful diagnostics. If you still want lightweight visibility, consider summarizing shard-level data-key stats instead.

♻️ Minimal replacement using shard-level stats
 void PrintCcMap()
 {
-    // std::unordered_map<TableName, size_t>
-    //     mapsizes;  // not string owner, sv -> native_ccms_
-    // for (const auto &cc_shard : cc_shards_)
-    // {
-    //     CcShard &shard = *cc_shard;
-    //
-    //     for (auto map_iter = shard.native_ccms_.begin();
-    //          map_iter != shard.native_ccms_.end();
-    //          ++map_iter)
-    //     {
-    //         const TableName &tab_name = map_iter->first;
-    //         auto find_iter = mapsizes.find(tab_name);
-    //         size_t cnt = 0;
-    //         if (find_iter != mapsizes.end())
-    //         {
-    //             cnt = find_iter->second;
-    //             cnt += map_iter->second->size();
-    //             find_iter->second = cnt;
-    //         }
-    //         else
-    //         {
-    //             mapsizes.emplace(
-    //                 std::piecewise_construct,
-    //                 std::forward_as_tuple(tab_name.StringView(),
-    //                                       tab_name.Type(),
-    //                                       tab_name.Engine()),
-    //                 std::forward_as_tuple(map_iter->second->size()));
-    //         }
-    //
-    //         std::cout << "Table '" << tab_name.StringView() << "' core ID "
-    //                   << shard.core_id_ << ": " <<
-    //                   map_iter->second->size()
-    //                   << std::endl;
-    //
-    //         assert(map_iter->second->VerifyOrdering() ==
-    //                map_iter->second->size());
-    //     }
-    // }
+    size_t total_data_keys = 0;
+    size_t total_dirty_keys = 0;
+    for (const auto &cc_shard : cc_shards_)
+    {
+        auto [data_keys, dirty_keys] = cc_shard->GetDataKeyStats();
+        total_data_keys += data_keys;
+        total_dirty_keys += dirty_keys;
+    }
+    std::cout << "Data keys: " << total_data_keys
+              << ", dirty: " << total_dirty_keys << std::endl;
 
     /*for (auto map_iter = mapsizes.begin(); map_iter != mapsizes.end();
          ++map_iter)
     {
         std::cout << map_iter->first << ": " << map_iter->second
                   << std::endl;
     }*/
 }
tx_service/src/cc/cc_shard.cpp (1)

380-404: Assertions are disabled in release builds; underflow would corrupt counters.

"If NDEBUG is defined as a macro name at the point in the source code where or <assert.h> is included, the assertion is disabled: assert does nothing." "Common pratice is to include -DNDEBUG in release flags, and that's what you get by default from CMake."

In release builds, if a negative delta exceeds the current counter value, the size_t will wrap around to a very large value, corrupting data_key_count_ or dirty_data_key_count_. This could lead to incorrect dirty-ratio reporting and unpredictable behavior in memory management or checkpointing.

Consider adding runtime-safe guards that also work in release builds:

🛡️ Proposed fix to add release-safe underflow protection
 void CcShard::AdjustDataKeyStats(const TableName &table_name,
                                  int64_t size_delta,
                                  int64_t dirty_delta)
 {
     if (table_name.IsMeta())
     {
         return;
     }
 
     if (size_delta != 0)
     {
-        assert(size_delta >= 0 ||
-               data_key_count_ >= static_cast<size_t>(-size_delta));
+        if (size_delta < 0 &&
+            data_key_count_ < static_cast<size_t>(-size_delta))
+        {
+            LOG(ERROR) << "data_key_count_ underflow: current="
+                       << data_key_count_ << ", delta=" << size_delta;
+            data_key_count_ = 0;
+        }
+        else
+        {
             data_key_count_ = static_cast<size_t>(
                 static_cast<int64_t>(data_key_count_) + size_delta);
+        }
     }
 
     if (dirty_delta != 0)
     {
-        assert(dirty_delta >= 0 ||
-               dirty_data_key_count_ >= static_cast<size_t>(-dirty_delta));
+        if (dirty_delta < 0 &&
+            dirty_data_key_count_ < static_cast<size_t>(-dirty_delta))
+        {
+            LOG(ERROR) << "dirty_data_key_count_ underflow: current="
+                       << dirty_data_key_count_ << ", delta=" << dirty_delta;
+            dirty_data_key_count_ = 0;
+        }
+        else
+        {
             dirty_data_key_count_ = static_cast<size_t>(
                 static_cast<int64_t>(dirty_data_key_count_) + dirty_delta);
+        }
     }
 }

Comment on lines +632 to +641
/**
* @brief Check if the entry is dirty (not yet checkpointed).
* This function only relies on CommitTs and CkptTs or commit_ts_and_status_,
* and does not consider standby node status. Use this strictly for dirty
* size tracking.
*
* @return true if the entry is dirty (CommitTs > CkptTs for versioned,
* or flush bit not set for non-versioned), false otherwise.
*/
bool IsDirty() const;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Document the CommitTs() <= 1 exclusion.

The implementation returns false for initial entries even when the flush bit isn’t set, but the comment doesn’t mention this. Please update the doc to avoid misinterpretation in metrics/tests.

✏️ Doc-only fix
-     * `@return` true if the entry is dirty (CommitTs > CkptTs for versioned,
-     *         or flush bit not set for non-versioned), false otherwise.
+     * `@return` true if the entry is dirty (CommitTs > 1 and CommitTs > CkptTs for
+     *         versioned, or CommitTs > 1 and flush bit not set for non-versioned),
+     *         false otherwise.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/**
* @brief Check if the entry is dirty (not yet checkpointed).
* This function only relies on CommitTs and CkptTs or commit_ts_and_status_,
* and does not consider standby node status. Use this strictly for dirty
* size tracking.
*
* @return true if the entry is dirty (CommitTs > CkptTs for versioned,
* or flush bit not set for non-versioned), false otherwise.
*/
bool IsDirty() const;
/**
* `@brief` Check if the entry is dirty (not yet checkpointed).
* This function only relies on CommitTs and CkptTs or commit_ts_and_status_,
* and does not consider standby node status. Use this strictly for dirty
* size tracking.
*
* `@return` true if the entry is dirty (CommitTs > 1 and CommitTs > CkptTs for
* versioned, or CommitTs > 1 and flush bit not set for non-versioned),
* false otherwise.
*/
bool IsDirty() const;
🤖 Prompt for AI Agents
In `@tx_service/include/cc/cc_entry.h` around lines 632 - 641, Update the
IsDirty() doc to explicitly state that entries with CommitTs() <= 1 are treated
as non-dirty (i.e., IsDirty() returns false) even if the non-versioned flush bit
isn’t set; mention that this special-case mirrors the code path that checks
CommitTs(), CkptTs(), and commit_ts_and_status_ for initial entries and is
intended for dirty size tracking and metrics/tests to avoid misinterpretation.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@tx_service/src/cc/cc_entry.cpp`:
- Around line 82-102: In VersionedLruEntry<Versioned,
RangePartitioned>::IsDirty(), short‑circuit persisted shared‑storage entries by
returning false when IsPersistent() is true so they aren’t counted as dirty;
update IsDirty() to call IsPersistent() (before performing CommitTs/CkptTs or
commit_ts_and_status_ checks) and return false if true, preserving the existing
initial‑entry check (CommitTs() <= 1) and the existing Versioned vs
non‑Versioned logic otherwise.
🧹 Nitpick comments (1)
tx_service/include/cc/object_cc_map.h (1)

1958-1961: Use IsDirty() for consistency across standby-forward paths.

Both paths manually compute was_dirty; using cce->IsDirty() keeps semantics aligned with the rest of the code and future changes.

♻️ Suggested change
-                bool was_dirty = (cce->CommitTs() > 1 && !cce->IsPersistent());
+                bool was_dirty = cce->IsDirty();
                 cce->SetCommitTsPayloadStatus(commit_ts, payload_status);
                 this->OnCommittedUpdate(cce, was_dirty);
-                bool was_dirty = (cce->CommitTs() > 1 && !cce->IsPersistent());
+                bool was_dirty = cce->IsDirty();
                 cce->EmplaceAndCommitBufferedTxnCommand(
                     txn_cmd, shard_->NowInMilliseconds());
                 this->OnCommittedUpdate(cce, was_dirty);

Also applies to: 1994-1998

Comment on lines +82 to +102
template <bool Versioned, bool RangePartitioned>
bool VersionedLruEntry<Versioned, RangePartitioned>::IsDirty() const
{
// Only check CommitTs > 1 to exclude initial entries
if (CommitTs() <= 1)
{
return false;
}

if (Versioned)
{
// For versioned records, dirty means CommitTs > CkptTs
return CommitTs() > CkptTs();
}
else
{
// For non-versioned records, dirty means the flush bit (5th bit) is not
// set
return !(entry_info_.commit_ts_and_status_ & 0x10);
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Guard IsDirty for shared-storage persisted entries.

IsPersistent() returns true on standby + shared storage, but IsDirty() ignores that, so dirty stats can be inflated even when entries are treated persisted. Consider short‑circuiting on IsPersistent().

🛠 Proposed fix
 bool VersionedLruEntry<Versioned, RangePartitioned>::IsDirty() const
 {
     // Only check CommitTs > 1 to exclude initial entries
     if (CommitTs() <= 1)
     {
         return false;
     }
+    if (IsPersistent())
+    {
+        return false;
+    }
 
     if (Versioned)
     {
         // For versioned records, dirty means CommitTs > CkptTs
         return CommitTs() > CkptTs();
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
template <bool Versioned, bool RangePartitioned>
bool VersionedLruEntry<Versioned, RangePartitioned>::IsDirty() const
{
// Only check CommitTs > 1 to exclude initial entries
if (CommitTs() <= 1)
{
return false;
}
if (Versioned)
{
// For versioned records, dirty means CommitTs > CkptTs
return CommitTs() > CkptTs();
}
else
{
// For non-versioned records, dirty means the flush bit (5th bit) is not
// set
return !(entry_info_.commit_ts_and_status_ & 0x10);
}
}
template <bool Versioned, bool RangePartitioned>
bool VersionedLruEntry<Versioned, RangePartitioned>::IsDirty() const
{
// Only check CommitTs > 1 to exclude initial entries
if (CommitTs() <= 1)
{
return false;
}
if (IsPersistent())
{
return false;
}
if (Versioned)
{
// For versioned records, dirty means CommitTs > CkptTs
return CommitTs() > CkptTs();
}
else
{
// For non-versioned records, dirty means the flush bit (5th bit) is not
// set
return !(entry_info_.commit_ts_and_status_ & 0x10);
}
}
🤖 Prompt for AI Agents
In `@tx_service/src/cc/cc_entry.cpp` around lines 82 - 102, In
VersionedLruEntry<Versioned, RangePartitioned>::IsDirty(), short‑circuit
persisted shared‑storage entries by returning false when IsPersistent() is true
so they aren’t counted as dirty; update IsDirty() to call IsPersistent() (before
performing CommitTs/CkptTs or commit_ts_and_status_ checks) and return false if
true, preserving the existing initial‑entry check (CommitTs() <= 1) and the
existing Versioned vs non‑Versioned logic otherwise.

@liunyl liunyl merged commit a7a327d into main Jan 29, 2026
4 checks passed
@liunyl liunyl deleted the dirty_size branch January 29, 2026 09:30
@coderabbitai coderabbitai bot mentioned this pull request Feb 11, 2026
5 tasks
@liunyl liunyl self-assigned this Feb 12, 2026
@coderabbitai coderabbitai bot mentioned this pull request Mar 4, 2026
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants